Search of Performance Inefficiencies in Message Passing Applications with KappaPI 2 Tool

Authors

  • Josep Jorba
  • Tomàs Margalef
  • Emilio Luque
Abstract

Performance is a crucial issue in parallel/distributed applications. Automatic performance analysis tools are one useful class of tools in this context: they help developers in some phases of the performance tuning process. KappaPI 2 is an automatic performance analysis tool with open knowledge about typical inefficiencies in message passing applications. It detects and analyzes these inefficiencies, and then suggests to the developer how to improve the application's behavior.


Similar resources

SCALASCA Parallel Performance Analyses of SPEC MPI2007 Applications

The SPEC MPI2007 1.0 benchmark suite provides a rich variety of message-passing HPC application kernels to compare the performance of parallel/distributed computer systems. Its 13 applications use a representative cross-section of programming languages (C/C++/Fortran, often combined) and MPI programming patterns (e.g., blocking vs. non-blocking vs. persistent point-to-point communication, with...


Infrastructure for Performance Tuning MPI Applications

An abstract of the thesis of Kathryn Marie Mohror for the Master of Science in Computer Science presented November 13, 2003. Title: Infrastructure for Performance Tuning MPI Applications. Clusters of workstations are becoming increasingly popular as a low-budget alternative for supercomputing power. In these systems, message-passing is often used to allow the separate nodes to act as a single co...


Performance Tool Support for MPI-2 on Linux

Programmers of message-passing codes for clusters of workstations face a daunting challenge in understanding the performance bottlenecks of their applications. This is largely due to the vast amount of performance data that is collected, and the time and expertise necessary to use traditional parallel performance tools to analyze that


Protocol-Dependent Message-Passing Performance on Linux Clusters

In a Linux cluster, as in any multi-processor system, the inter-processor communication rate is the major limiting factor to its general usefulness. This research is geared toward improving the communication performance by identifying where the inefficiencies lie and trying to understand their cause. The NetPIPE utility is being used to compare the latency and throughput of all current message-...


TorusBFS: A Novel Message-passing Parallel Breadth-First Search Architecture on FPGAs

Graphs are a fundamental data structure used extensively in numerous domains. In graph-based applications, Breadth-First Search (BFS) is a key component which suffers from long latency of memory accesses. In this paper, we present a novel message passing parallel BFS architecture namely TorusBFS on field-programmable gate array (FPGA). By utilizing the on-chip memories to store the visitation s...




Publication date: 2006